Classifying free-text triage chief complaints into syndromic categories with natural language processing

Authors

  • Wendy W. Chapman
  • Lee M. Christensen
  • Michael M. Wagner
  • Peter J. Haug
  • Oleg Ivanov
  • John N. Dowling
  • Robert T. Olszewski
Abstract

OBJECTIVE Develop and evaluate a natural language processing application for classifying chief complaints into syndromic categories for syndromic surveillance.

INTRODUCTION Much of the input data for artificial intelligence applications in the medical field are free-text patient medical records, including dictated medical reports and triage chief complaints. To be useful for automated systems, the free-text must be translated into encoded form.

METHODS We implemented a biosurveillance detection system from Pennsylvania to monitor the 2002 Winter Olympic Games. Because input data was in free-text format, we used a natural language processing text classifier to automatically classify free-text triage chief complaints into syndromic categories used by the biosurveillance system. The classifier was trained on 4700 chief complaints from Pennsylvania. We evaluated the ability of the classifier to classify free-text chief complaints into syndromic categories with a test set of 800 chief complaints from Utah.

RESULTS The classifier produced the following areas under the ROC curve: Constitutional = 0.95; Gastrointestinal = 0.97; Hemorrhagic = 0.99; Neurological = 0.96; Rash = 1.0; Respiratory = 0.99; Other = 0.96. Using information stored in the system's semantic model, we extracted from the Respiratory classifications lower respiratory complaints and lower respiratory complaints with fever with a precision of 0.97 and 0.96, respectively.

CONCLUSION Results suggest that a trainable natural language processing text classifier can accurately extract data from free-text chief complaints for biosurveillance.
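The abstract does not say how the classifier works internally, so the following is only a minimal sketch of the general workflow it describes: train a text classifier on labeled chief complaints, score unseen complaints per syndromic category, and summarize performance as area under the ROC curve. The choice of multinomial naive Bayes, the names `tokenize`, `NaiveBayesClassifier`, and `roc_auc`, and the toy complaints are all assumptions made for illustration. None of them come from the paper or its data.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a chief complaint and split it on whitespace and slashes."""
    return text.lower().replace("/", " ").split()


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one smoothing (illustrative assumption)."""

    def fit(self, complaints, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(complaints, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict_proba(self, text):
        """Return {category: posterior probability} for one complaint."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, n in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n / total)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            log_scores[label] = score
        # Normalize in a numerically stable way.
        peak = max(log_scores.values())
        unnorm = {k: math.exp(v - peak) for k, v in log_scores.items()}
        norm = sum(unnorm.values())
        return {k: v / norm for k, v in unnorm.items()}


def roc_auc(scores, is_positive):
    """AUC as the chance that a positive case outranks a negative one (ties = 0.5)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy training data, not taken from the paper.
train = [
    ("cough and fever", "Respiratory"),
    ("shortness of breath", "Respiratory"),
    ("vomiting and diarrhea", "Gastrointestinal"),
    ("abdominal pain", "Gastrointestinal"),
    ("rash on arms", "Rash"),
]
model = NaiveBayesClassifier().fit([t for t, _ in train], [c for _, c in train])

# Invented toy test set; one-vs-rest AUC for the Respiratory category.
test = [("cough", "Respiratory"), ("diarrhea", "Gastrointestinal"), ("itchy rash", "Rash")]
resp_scores = [model.predict_proba(t)["Respiratory"] for t, _ in test]
resp_truth = [c == "Respiratory" for _, c in test]
print("Respiratory AUC on toy data:", roc_auc(resp_scores, resp_truth))
```

Computing one AUC per category in this one-vs-rest fashion mirrors how the results above are reported, with a separate value for each syndromic category.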


Related resources

Assessing the performance of American chief complaint classifiers on Victorian syndromic surveillance data

Syndromic surveillance systems aim to support early detection of salient disease outbreaks, and to shed timely light on the size and spread of pandemic outbreaks. They can also be used more generally to monitor disease trends and provide reassurance that an outbreak has not occurred. One commonly used technique for syndromic surveillance is concerned with classifying Emergency Department data, ...


A Term-based Approach to Asyndromic Determination of Significant Case Clusters

Introduction Biosurveillance systems commonly depend on free-text chief complaints (CC)s for timely situational awareness. However, diagnosis codes may not be available soon enough and may have uncertain value because they are assigned for billing purposes rather than for population monitoring. Existing systems use syndrome categories to classify records based on these free-text fields. A syndr...


Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints

PURPOSE Syndromic surveillance is aimed at early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which may be recorded in different languages. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories to facilitate subsequent data aggregation and analysis. Despite the fact that syndr...


Evaluation of preprocessing techniques for chief complaint classification

OBJECTIVE To determine whether preprocessing chief complaints before automatically classifying them into syndromic categories improves classification performance. METHODS We preprocessed chief complaints using two preprocessors (CCP and EMT-P) and evaluated whether classification performance increased for a probabilistic classifier (CoCo) or for a keyword-based classifier (modification of the...


Identifying ILI Cases from Chief Complaints: Comparing Keyword and Support Vector Machine Methods

The rapid spread of the novel H1N1 virus prompted Ottawa Public Health (OPH) to monitor Emergency Department Chief Complaints (EDCC) specifically for influenza-like illness (ILI). Note that data from ED visits is the most common data source for syndromic surveillance systems in the US [1]. METHODS Our data set was formed of 149910 case records composed of free text EDCC and accompanying patient...



Journal title:
  • Artificial intelligence in medicine

Volume 33, Issue 1

Pages: -

Publication date: 2005